Method and apparatus for triangle rasterization with clipping and wire-frame mode support

ABSTRACT

A low-cost high-speed programmable rasterizer accepting an input set of functionals representing a triangle, clipping planes and a scissoring box, and producing multiple spans per clock cycle as output. A Loader converts the input set from a general form to a special case form accepted by a set of Edge Generators, the restricted input format accepted by the Edge Generators contributing to their efficient hardware implementation.

BACKGROUND

1. Field

Invention relates generally to rasterizers and, more particularly, to accelerating the conversion of primitives defined by vertexes to equivalent images composed of pixel patterns that can be stored and manipulated as sets of bits.

2. Related Art

Raster displays are commonly used in computer graphics systems. These displays store graphics images as a matrix of the smallest picture elements that can be displayed on a screen (“pixels”) with data representing each pixel being stored in a display buffer. This data specifies the display attributes for each pixel on the screen such as the intensity and color of the pixel. An entire image is read from the display buffer and displayed on the screen by sequentially scanning out horizontal rows of pixel data or “scan lines.”

Raster display systems commonly use polygons as basic building blocks or “primitives” for drawing more complex images. Triangles are a common basic primitive for polygon drawing systems, since a triangle is the simplest polygon and more complex polygons can be represented as sets of triangles. The process of drawing triangles and other geometric primitives on the screen is known as “rasterization.”

An important part of rasterization involves determining which pixels fall within a given triangle. Rasterization systems generally step from pixel to pixel in various ways and determine whether or not to “render,” i.e. to draw into a frame buffer or pixel map, each pixel as part of the triangle. This, in turn, determines how to set the data in the display buffer representing each pixel. Various traversal algorithms have been developed for moving from pixel to pixel in a way such that all pixels within the triangle are covered.

Rasterization systems sometimes represent a triangle as a set of three edge-functions. An edge function is a line equation representing a straight line, which serves to subdivide a two-dimensional plane. Edge functions classify each point within the plane as falling into one of three regions: the region “inside” of the triangle, the region “outside” of the triangle or the region representing the line itself. The type of edge function that will be discussed has the property that points “inside” of the triangle have a value greater than zero, points “outside” have a value less than zero, and points exactly on the line have a value of zero. This is shown in FIG. 1 a. Applied to rasterization systems, the two-dimensional plane is represented by the graphics screen, points are represented by individual pixels, and the edge function serves to subdivide the graphics screen.

The intersection of three half-planes, each of which is specified by an edge function, forms a triangle. It is possible to define more complex polygons by using Boolean combinations of more than three edges. Since the rasterization of triangles involves determining which pixels to render, a tiebreaker rule is generally applied to pixels that lie exactly on any of the edges to determine whether the pixels are to be considered interior or exterior to the triangle.

As shown in FIG. 1 b, each pixel has associated with it a set of edge variables (e₀, e₁ and e₂) which are proportional to the signed distance between the pixel and the three respective edges. The value of each edge variable is determined for a given triangle by evaluating the three edge functions, f₀(x,y), f₁(x,y) and f₂(x,y) for the pixel location. It is important to note that it can be determined whether or not a pixel falls within a triangle by looking at only the signs of e₀, e₁ and e₂.

In determining which pixels to render within a triangle, typical rasterization systems compute the values of the edge variables (e₀, e₁ and e₂) for a given set of three edge functions and a given pixel position, and then use a set of increment values (Δe_(outside), Δe_(inside), etc.) to determine the edge variable values for adjacent pixels. The rasterization system traverses the triangle, adding the increment values to the current values as a traversal algorithm steps from pixel to pixel.

With reference again to FIG. 1 a, a line is illustrated that is defined by two points: (X,Y) and (X+dX, Y+dY). As noted above, this line can be used to divide the two-dimensional space into three regions: all points “outside” of, “inside” of, and exactly on the line. The edge function f(x,y) can be defined as f(x,y)=(x−X)dY−(y−Y)dX. This function has the useful property that its value is related to the position of the point (x,y) relative to the edge defined by the points (X,Y) and (X+dX, Y+dY):

f(x,y)>0 if (x,y) is “inside”;

f(x,y)=0 if (x,y) is exactly on the line; and

f(x,y)<0 if (x,y) is “outside”.

Existing rasterization systems commonly use this function, since it can be computed incrementally by simple addition: f(x+1,y)=f(x,y)+dY and f(x,y+1)=f(x,y)−dX.
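The incremental property can be illustrated with a minimal Python sketch. The function names `edge_function` and `edge_values_along_row` are hypothetical and chosen only for this illustration; the arithmetic follows f(x,y)=(x−X)dY−(y−Y)dX as defined above.

```python
def edge_function(x, y, X, Y, dX, dY):
    """Direct evaluation of f(x, y) = (x - X)*dY - (y - Y)*dX."""
    return (x - X) * dY - (y - Y) * dX


def edge_values_along_row(x_start, y, count, X, Y, dX, dY):
    """Evaluate f along one row with a single multiply-based setup
    followed by additions only: f(x+1, y) = f(x, y) + dY."""
    f = edge_function(x_start, y, X, Y, dX, dY)
    values = []
    for _ in range(count):
        values.append(f)
        f += dY  # step one pixel to the right
    return values
```

Stepping one row down works the same way with −dX as the increment, so a traversal needs only one full evaluation per triangle edge.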

A variety of different traversal algorithms are presently used by different rasterization systems in the rendering process. Any algorithm guaranteed to cover all of the pixels within the triangle can be used. For example, some solutions involve following the sides of the triangle while identifying a horizontal or vertical span of pixels therein. Following the sides of the triangle is adequate for the triangle edges, but if the triangle is clipped by a near or far plane, these boundaries are not known explicitly and cannot be followed as easily as the triangle edges. Other methods test individual pixels one at a time. More recently, multiple pixels have been tested in parallel to speed up the rasterization process.

Some conventional rasterizers use span-based pixel generation and contain edge and span interpolators based on the well-known Bresenham algorithm. The speed of those rasterizers depends on the interpolation speed. Furthermore, they require a complicated setup process. In most cases such rasterizers interpolate many associated parameters such as color, texture, etc. with appropriate hardware. Increasing the speed of such rasterizers requires a significant increase in the number and complexity of the interpolators, an approach not suitable for commercial products. In the case of clipping support, the structure of such rasterizers is too complex for efficient implementation.

Another approach is to use area rasterizers based on a definition of inner and outer pixels, grouped into blocks, in which the equation values at the corner pixels of each block are checked to classify the block as inner, border or outer. This approach may accelerate the generation of bit-masks of inner blocks, but the border blocks either need to be processed pixel by pixel or need a significant amount of dedicated hardware for processing those pixels in parallel.

Accordingly, there is a need for a low-cost high-speed rasterizer having a simple and uniform structure and capable of generating multiple spans per clock cycle.

SUMMARY

Invention describes a low-cost high-speed programmable rasterizer. The rasterizer accepts as input a set of functionals representing a triangle, clipping planes and a scissoring box, and produces multiple spans per clock cycle as output. A Loader converts the input set, as expressed in one of a number of general forms, to an expression conforming to a special case format as accepted by a set of Edge Generators. The restricted input format accepted by the Edge Generators contributes to their efficient hardware implementation.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 a is a diagram illustrating a half-plane, according to an embodiment of the present invention.

FIG. 1 b is a diagram illustrating a triangle defined by three half-planes, according to an embodiment of the present invention.

FIG. 1 c is a diagram illustrating a polygon defined by a set of half-planes, according to an embodiment of the present invention.

FIG. 1 d is a diagram illustrating an opened half-plane and a closed half-plane, according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating normals in quadrants and the definition of “right” and “left” half-planes, according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating a wire-frame triangle, according to an embodiment of the present invention.

FIG. 4 is a flow diagram illustrating a method for the moving-down process in preparation for the Bresenham setup, according to an embodiment of the present invention.

FIG. 5 is a flow diagram illustrating the foregoing method for the Bresenham setup process, according to an embodiment of the present invention.

FIG. 6 is a flow diagram illustrating a method for the Bresenham walk process, according to an embodiment of the present invention.

FIG. 7 is a block diagram illustrating a Span Generator, according to an embodiment of the present invention.

FIG. 8 is a block diagram illustrating a Loader (without shifters), according to an embodiment of the present invention.

FIG. 9 is a block diagram illustrating b̃ and c̃ values wrapping before they are loaded into an Edge Generator, according to an embodiment of the present invention.

FIG. 10 a is a block diagram illustrating an Edge Generator, according to an embodiment of the present invention.

FIG. 10 b is a block diagram illustrating an Edge Generator during the moving-down phase, according to an embodiment of the present invention.

FIG. 10 c is a block diagram illustrating an Edge Generator during the Bresenham setup phase, according to an embodiment of the present invention.

FIG. 11 a is a block diagram illustrating a Scissoring Box origin, according to an embodiment of the present invention.

FIG. 11 b is a block diagram illustrating a Scissoring Box, according to an embodiment of the present invention.

DETAILED DESCRIPTION

The following serves as a glossary of terms as defined herein:

-   Triangle: Intersection of three half-planes, wherein each half-plane is “open” or “closed”.
-   Polygon: Intersection of a triangle and the clipping half-planes (shown in FIG. 1 c), wherein each clipping half-plane is “open” or “closed”.
-   “Open” half-plane: A half-plane which satisfies the inequality (as shown in FIG. 1 d)
    a·x+b·y+c>0  (1)
-   “Closed” half-plane: A half-plane which satisfies the inequality (as shown in FIG. 1 d)
    a·x+b·y+c≥0  (2)
-   Half-plane functional: An expression describing a half-plane or a line
    f(x, y)=a·x+b·y+c  (3)
-   “Right” half-plane: A half-plane described by a functional f(x, y)=a·x+b·y+c, where
    a<0 ∨ (a=0 ∧ b<0)  (4)
-   “Left” half-plane: A half-plane described by a functional f(x, y)=a·x+b·y+c, where
    a>0 ∨ (a=0 ∧ b>0)  (5)
-   Scissoring box: A rectangle representing a part of the view-port where polygons are actually drawn.
-   Bounding box: The smallest rectangle that fits the intersection of a triangle and the scissoring box.
-   Extended bounding box: A bounding box whose horizontal size is the smallest power of 2 that is greater than or equal to the size of the bounding box.
-   w: The horizontal size of the bounding box
    w=x_(max)−x_(min)  (6)
-   W: The horizontal size of the extended bounding box, which can be expressed as
    $W = 2^{\mathrm{ceiling}(\log_2 w)} \qquad (7)$
-   x: Representation of the integer horizontal coordinate inside the bounding box expressed in current grid units
-   y: Representation of the integer vertical coordinate inside the bounding box expressed in current grid units
-   x_(min): Representation of the minimal horizontal coordinate of the bounding box
-   y_(min): Representation of the minimal vertical coordinate of the bounding box
-   a, b, c: Integer coefficients of the functional of the half-plane
-   ã, b̃, c̃: Integer coefficients of the functional transformed to the bounding box relative coordinates according to the special case of the edge functional
-   “Edge” of a “left” half-plane: The set of points (x_(i), y_(i)) satisfying the expression
    $x_i = \min\limits_{x \in Z}\{x : a \cdot x + b \cdot y_i + c \geq 0\}, \quad \text{where } a > 0 \text{ and } i = 0, 1, \ldots, y_{\max} - y_{\min}, \text{ or} \qquad (8)$
    $y_i = \min\limits_{y \in Z}\{y : a \cdot x_i + b \cdot y + c \geq 0\}, \quad \text{where } a = 0 \text{ and } b > 0 \text{ and } i = 0, 1, \ldots, x_{\max} - x_{\min} \qquad (9)$
-   “Left” edge: “Edge” of a “left” half-plane
-   “Edge” of a “right” half-plane: The set of points (x_(i), y_(i)) satisfying the expression
    $x_i = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y_i + c > 0\}, \quad \text{where } a < 0 \text{ and } i = 0, 1, \ldots, y_{\max} - y_{\min}, \text{ or} \qquad (10)$
    $y_i = \max\limits_{y \in Z}\{y : a \cdot x_i + b \cdot y + c > 0\}, \quad \text{where } a = 0 \text{ and } b < 0 \text{ and } i = 0, 1, \ldots, x_{\max} - x_{\min} \qquad (11)$
-   “Right” edge: “Edge” of a “right” half-plane
-   “Edge” of a half-plane: If the half-plane is a “right” half-plane, then the “edge” of the “right” half-plane, otherwise the “edge” of the “left” half-plane
-   “Edge” of a polygon: “Edge” of one of the half-planes forming the polygon
-   Wire-frame: A disjunction of three parallelograms based on the three edges of the triangle
-   “Width” of a wire-frame: Integer number which expresses, in the current grid units, the projection of the width of the wire-frame line to a minor direction axis of the current grid.
-   d: The width of the wire-frame line
-   Edge Generator (EG): State machine to generate an edge of a half-plane, which computes a sequence of x coordinate values in order of incrementing y coordinate associated with one of the functionals
-   Loader: Pipelined device to transform input functionals to the form which is convenient for EG to work
-   Sorter: Pipelined device to compute the intersection of half-planes, edges of which are generated by several EG
-   Span buffer: Temporary storage for spans before tiling
-   Tiling: Process of making tiles
-   Tile: Set of 8×8 pixels, aligned by x and y coordinates
-   Tile Generator (TG): State machine to produce tiles from spans in Span Buffer
-   Moving Down: Phase of the EG when EG is adding b̃ value to the functional value each clock until the functional value is positive
-   SHORT: Data type to define signed 22-bit numbers
-   LONG: Data type to define signed 42-bit numbers
-   BITN: Data type to define unsigned N-bit numbers
-   BITNS: Data type to define signed N-bit numbers

Triangle Edge Definition

We assume that triangle edge functions are defined as
$F_i(x, y) = a_i \cdot x + b_i \cdot y + c_i = \det\begin{pmatrix} x_j & x_k & x \\ y_j & y_k & y \\ 1 & 1 & 1 \end{pmatrix}, \quad i = 0, 1, 2 \qquad (12)$
wherein j=(i+1) mod 3, k=(i+2) mod 3 and [x_(i), y_(i)], i=0, 1, 2 are triangle vertex coordinates in a standard window coordinate system expressed in the units of the main grid (see above). If the functionals are set up as “implicit” clipping functionals, they should be converted to this format as well.

End Points

For a right edge and a given span y_(i) the interpolator should produce x_(i) such that
$x_i = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y_i + c > 0\} \qquad (13)$
such an x_(i) point for a≠0 is the last (inclusive) point of the span.

For a left edge and a given span y_(i) the interpolator should produce x_(i) such that
$x_i = \min\limits_{x \in Z}\{x : a \cdot x + b \cdot y_i + c \geq 0\} \qquad (14)$
such an x_(i) point for a≠0 is the first (inclusive) point of the span.

If we have a=0 then the edge (left or right) is horizontal, thus the end points of the span for the functional will be x₀=0 and x₀=W.

General Cases for the Edge Generator

In the general case we have open right half-planes and closed left half-planes, classified as follows (also shown in FIG. 2):

Case #   Half-plane                                           Normal quadrant   a     b
1        Right open                                           II                <0    ≥0
2        Right open                                           III               <0    <0
3        Right open                                           III               =0    <0
4        Left closed                                          IV                >0    ≤0
5        Left closed                                          I                 >0    >0
6        Left closed                                          I                 =0    >0
7        Whole bounding box is inside or outside the plane    n/a               =0    =0

A Loader 102 transforms a functional given according to a general case into a functional given by the special case, with the special case and the general cases described as follows:
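The classification above depends only on the signs of a and b, so it can be sketched as a small decision function. This is an illustrative Python sketch; the name `classify_half_plane` is hypothetical and does not appear in the specification.

```python
def classify_half_plane(a, b):
    """Return the general case number (1..7) from the signs of a and b."""
    if a < 0:
        return 1 if b >= 0 else 2   # right open, normal in quadrant II or III
    if a > 0:
        return 4 if b <= 0 else 5   # left closed, normal in quadrant IV or I
    if b < 0:
        return 3                    # a = 0: horizontal right open edge
    if b > 0:
        return 6                    # a = 0: horizontal left closed edge
    return 7                        # a = b = 0: whole box inside or outside
```

For example, `classify_half_plane(-2, 0)` falls into case 1 and `classify_half_plane(0, 0)` into case 7.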

Special Case for the Edge Generator

The Edge Generator 103 (shown in FIG. 7) operates within a discrete space with integer coefficients. To simplify the work of the Edge Generator 103, the Edge Generator 103 is designed to draw an edge of a closed right half-plane (i.e. a right edge), whose normal is located in quadrant II since we have a<0 and b≥0. For an edge with a<0 we have:
$f(x, y) = a \cdot x + b \cdot y + c = 0 \quad \Rightarrow \quad x = -\frac{b}{a} \cdot y - \frac{c}{a} \qquad (15)$
from which we calculate
$x_0 = \mathrm{floor}\left(-\frac{c}{a}\right), \qquad \Delta x = \mathrm{floor}\left(-\frac{b}{a}\right) \qquad (16)$

The Edge Generator 103 works in a vertical stripe [x_(min), x_(max)] using an x coordinate relative to x_(min) which satisfies 0≤x≤W, wherein W=2^(m) is the size of the extended bounding box. As described below, the setup division starts as soon as the functional changes sign from negative to positive and f(0, y)≥0, hence resulting in x₀≥0. Also, Δx≥0 according to the above assumption that a<0 and b≥0.

It is possible that the value of the functional ƒ is negative when the Edge Generator 103 starts operating (i.e. when y=0). In this case, the x₀ value could be negative and hence does not need to be computed, since we are only interested in the exact x₀ values which satisfy 0≤x₀≤W.

The completion of the moving-down process is followed by calculating x₀=floor(−c/a) and Δx=floor(−b/a) using a division process performed by the divider. Since the divider starts operating when the functional value changes its sign from negative to positive, we can assume that at the start of the division process ƒ(0, y)≥0. To calculate x₀ and Δx the divider operates under the assumption that
a<0, b≥0  (17)
and uses a simple adder-based divider. Since a<0, we take
$c_i - (-a_i) \equiv c_i + a_i, \qquad b_i - (-a_i) \equiv b_i + a_i \qquad (18)$
into consideration, start with
$c_0 = f(x, y), \quad a_0 = \tilde{a} \cdot 2^{m+1}, \quad b_0 = \tilde{b}, \quad x_{0,0} = \Delta x_0 = 0 \qquad (19)$
and then iterate as follows:
$\begin{aligned}
c_{i+1} &= c_i + \begin{cases} a_i, & c_i + a_i \geq 0 \\ 0, & c_i + a_i < 0 \end{cases} \\
x_{0,i+1} &= x_{0,i} \cdot 2 + \begin{cases} 1, & c_i + a_i \geq 0 \\ 0, & c_i + a_i < 0 \end{cases} \\
b_{i+1} &= b_i + \begin{cases} a_i, & b_i + a_i \geq 0 \\ 0, & b_i + a_i < 0 \end{cases}, \quad i = 0, 1, \ldots, m \\
\Delta x_{i+1} &= \Delta x_i \cdot 2 + \begin{cases} 1, & b_i + a_i \geq 0 \\ 0, & b_i + a_i < 0 \end{cases} \\
a_{i+1} &= \frac{a_i}{2}
\end{aligned} \qquad (20)$
describing the fully functional step-by-step integer divider.
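The iteration (19)-(20) behaves like a restoring divider that produces one quotient bit per step. The following Python sketch is an illustration only, not the hardware design: the name `adder_divide` is hypothetical, and the sketch halves the divisor before each comparison so that the m+1 steps end with ã itself. That ordering is an assumption made here about how (19) and (20) fit together.

```python
def adder_divide(c, a, b, m):
    """Compute (x0, dx) = (floor(-c/a), floor(-b/a)) with additions and
    comparisons only, one quotient bit per step, for a <= 0, b >= 0, c >= 0.
    Results saturate at 2**(m+1) - 1 (all quotient bits set)."""
    if a > 0 or b < 0 or c < 0:
        raise ValueError("expects a <= 0, b >= 0 and c >= 0")
    divisor = a * (1 << (m + 1))      # a_0 = a * 2^(m+1), as in (19)
    x0 = 0
    dx = 0
    for _ in range(m + 1):
        divisor //= 2                 # exact: divisor is a * 2^k with k >= 1
        c_bit = 1 if c + divisor >= 0 else 0
        b_bit = 1 if b + divisor >= 0 else 0
        if c_bit:
            c += divisor              # subtract |divisor| from the remainder
        if b_bit:
            b += divisor
        x0 = 2 * x0 + c_bit           # shift in the new quotient bit
        dx = 2 * dx + b_bit
    return x0, dx
```

With a=0 every comparison succeeds, so all m+1 quotient bits are set and the result is 2·W−1 for W=2^m, which is the zero-denominator behavior that Cases 3 and 6 rely on.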

Case 1: Right Open Half-plane and a<0 ∧ b≥0

The difference between this case and the special case is only that the half-plane is open.

Therefore we need to find
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c > 0\} \qquad (21)$
Since the coefficients and variables are integer,
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c - 1 \geq 0\} \qquad (22)$
and therefore
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + \tilde{c} \geq 0\}, \quad \tilde{c} = c - 1 \qquad (23)$
which reduces this case to the special case. Thus, in this case the Loader 102 (shown in FIG. 7) subtracts 1 from c before starting the Edge Generator 103.
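The reduction in (21)-(23) can be checked numerically. The sketch below is illustrative Python with hypothetical function names; it compares the closed-form result for the special case, after subtracting 1 from c, against a direct scan over integer x.

```python
def special_case_x0(a, b, c, y):
    """max{x : a*x + b*y + c >= 0} for a < 0 (closed right half-plane)."""
    return (b * y + c) // (-a)        # floor division; -a is positive


def right_open_x0(a, b, c, y):
    """Case 1: open right half-plane, handled by loading c - 1."""
    return special_case_x0(a, b, c - 1, y)


def brute_force_open(a, b, c, y, lo=-200, hi=200):
    """Reference: largest integer x in [lo, hi] with a*x + b*y + c > 0."""
    return max(x for x in range(lo, hi + 1) if a * x + b * y + c > 0)
```

The subtraction matters exactly when the edge passes through an integer point: for a=−3, b=2, c=9, y=0 the closed half-plane ends at x=3 while the open one ends at x=2.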

Case 2: Right Open Half-plane and a<0 ∧ b<0

Again we need to find
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c > 0\} \qquad (24)$
Substituting x=W−x̃ we have
$x_0 = W - \min\limits_{\tilde{x} \in Z}\{\tilde{x} : -a \cdot \tilde{x} + b \cdot y + c + W \cdot a > 0\} \qquad (25)$
and computing the maximum in the complementary semi-plane
$x_0 = W - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : -a \cdot \tilde{x} + b \cdot y + c + W \cdot a \leq 0\} - 1 \qquad (26)$
and rewriting the constraint and collecting appropriate terms we have
$x_0 = W - 1 - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : a \cdot \tilde{x} - b \cdot y - c - W \cdot a \geq 0\} \qquad (27)$
and finally
$x_0 = W - 1 - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot y + \tilde{c} \geq 0\} \qquad (28)$
where
ã=a, b̃=−b, c̃=−c−W·a  (29)
which reduces this case to the special case.
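The mirrored substitution x=W−x̃ in (24)-(29) can be verified the same way. This is an illustrative Python sketch with hypothetical function names; it applies (29), evaluates the special case in closed form, and maps the result back with x₀=W−1−x̃₀.

```python
def special_case_x0(a, b, c, y):
    """max{x : a*x + b*y + c >= 0} for a < 0 (closed right half-plane)."""
    return (b * y + c) // (-a)        # floor division; -a is positive


def case2_x0(a, b, c, y, W):
    """Case 2 (a < 0, b < 0): apply (29), then x0 = W - 1 - x~0 as in (28)."""
    a_t, b_t, c_t = a, -b, -c - W * a
    return W - 1 - special_case_x0(a_t, b_t, c_t, y)


def brute_force_open(a, b, c, y, lo=-200, hi=200):
    """Reference: largest integer x in [lo, hi] with a*x + b*y + c > 0."""
    return max(x for x in range(lo, hi + 1) if a * x + b * y + c > 0)
```

After the substitution the transformed b̃ is non-negative, which is what allows the same quadrant II hardware path to be reused.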

Case 3: Right Open Half-plane and a=0 ∧ b<0

In the previous case, for a<0 ∧ b<0, we had
$x_0 = W - 1 - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot y + \tilde{c} \geq 0\} \qquad (30)$
wherein
ã=a, b̃=−b, c̃=−c−W·a  (31)
In this case we have a=0, which means that (30) does not have a maximum. However, the division algorithm described above (16) is stable in the case of a zero denominator, producing in this case
$\tilde{x}_0 = 2 \cdot W - 1, \qquad x_0 = W - 1 - \tilde{x}_0 = -W \qquad (32)$
after the completion of the division algorithm, indicating that the x value reaches the other edge of the bounding box and that the Edge Generator 103 will draw a horizontal line.

Case 4: Left Closed Half-plane and a>0 ∧ b≤0

Again we want to find
$x_0 = \min\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c \geq 0\} \qquad (33)$
or equivalently
$x_0 = \min\limits_{x \in Z}\{x : -a \cdot x - b \cdot y - c \leq 0\} \qquad (34)$
Substituting a=−ã, b=−b̃ and computing the maximum in the complementary semi-plane, we have
$x_0 = \max\limits_{x \in Z}\{x : \tilde{a} \cdot x + \tilde{b} \cdot y - c > 0\} + 1 \qquad (35)$
Since the coefficients and variables are integer, we have
$x_0 = \max\limits_{x \in Z}\{x : \tilde{a} \cdot x + \tilde{b} \cdot y - c - 1 \geq 0\} + 1 \qquad (36)$
and therefore
$x_0 = \max\limits_{x \in Z}\{x : \tilde{a} \cdot x + \tilde{b} \cdot y + \tilde{c} \geq 0\} + 1 \qquad (37)$
wherein
ã=−a, b̃=−b, c̃=−c−1  (38)
reducing to the special case.

Case 5: Left Closed Half-plane and a>0 ∧ b>0

We want to find
$x_0 = \min\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c \geq 0\} \qquad (39)$
Substituting x=W−x̃ we have
$x_0 = W - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : -a \cdot \tilde{x} + b \cdot y + c + W \cdot a \geq 0\} \qquad (40)$
and therefore
$x_0 = W - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot y + \tilde{c} \geq 0\} \qquad (41)$
wherein
ã=−a, b̃=b, c̃=c+W·a  (42)
reducing to the special case.

Case 6: Left Closed Half-plane and a=0 ∧ b>0

In the previous case, for a>0 ∧ b>0, we had
$x_0 = W - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot y + \tilde{c} \geq 0\} \qquad (43)$
wherein
ã=−a, b̃=b, c̃=c+W·a  (44)
In this case we have a=0 resulting in (43) having no maximum. However, the division algorithm described above (16) is again stable in this case of zero denominator, resulting in
$\tilde{x}_0 = 2 \cdot W - 1, \qquad x_0 = W - 1 - \tilde{x}_0 = 0 \qquad (45)$
after the division algorithm completes, indicating that the x value reaches the other edge of the bounding box and that the Edge Generator 103 will draw a horizontal line.

Case 7: The Plane of the Polygon is Parallel to the Clipping Plane and a=0 ∧ b=0

This case indicates that the plane of the polygon is parallel to one of the clipping planes. In this case the sign of c determines whether the plane of the polygon is visible or not. If c<0, then the entire bounding box is invisible. The Edge Generator 103 will function normally, but all spans will be marked as being “outside the bounding box”. Otherwise, all spans will be marked as being “inside the bounding box”.

Wire-frame Support

The next two cases involve wire-frame support. FIG. 3 is a diagram illustrating a wire-frame of a triangle, according to one embodiment of the present invention. The wire-frame of a triangle is a disjunction of three parallelograms, each of which represents an edge of the triangle. We assume that a wire-frame to be drawn comprises a one-pixel line width. The wire-frame support works reliably under the following conditions: (a) no over-sampling (i.e. the current grid is the same as the pixel grid), and (b) the width of the wire-frame is one unit of the current grid (i.e. one pixel according to the foregoing assumption). If the wire-frame support works for any other mode (either over-sampling is on or the width is more than one), we consider the availability of those modes a bonus, which we expect to obtain at almost no additional cost.

We restrict the wire-frame mode as not comprising any clipping functionals besides a frustum. This means that a wire-framed triangle comprises (a) three functionals representing the triangle edges and (b) the bounding box.

A wire-framed triangle comprises three parameters for drawing:

-   Width: The width of an edge, expressed as the number of pixels to be covered by a triangle edge in the minor direction. A Span Generator 101 (shown in FIG. 7) correctly processes a wire-frame with a one-pixel width.
-   Edge flag: Draw-edge flag (one bit per edge). Each edge of the triangle is equipped with a draw-edge flag, indicating whether the edge is to be drawn.
-   Extension: Bounding box extension. If the draw-edge flag is set for an edge, the bounding box is extended by half of the wire-frame line width.

The wire-frame is an intersection of the “tight” bounding box and an exclusive intersection of two closed-edges triangles. Since the original functionals specify the center-line of each edge of the wire-framed triangle, the functionals for the wire-frame are offset by half of the wire-frame width in the “minor” direction, i.e. in the direction of that coordinate whose coefficient in the functional has a smaller absolute value:

Case 8: Right Closed Half-plane for Wire-frame and a<0 ∧ b≥0

There is no difference between this case and the special case, so we need to make no corrections for this case
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c \geq 0\} \qquad (46)$

Case 9: Right Closed Half-plane for Wire-frame and a≤0 ∧ b<0

Again we need to find
$x_0 = \max\limits_{x \in Z}\{x : a \cdot x + b \cdot y + c \geq 0\} \qquad (47)$
Substituting x=W−x̃
$x_0 = W - \min\limits_{\tilde{x} \in Z}\{\tilde{x} : -a \cdot \tilde{x} + b \cdot y + c + W \cdot a \geq 0\} \qquad (48)$
and computing the maximum in the complementary semi-plane
$x_0 = W - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : -a \cdot \tilde{x} + b \cdot y + c + W \cdot a < 0\} - 1 \qquad (49)$
and rewriting the constraint and collecting appropriate terms results in
$x_0 = W - 1 - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : a \cdot \tilde{x} - b \cdot y - c - W \cdot a > 0\} \qquad (50)$
and finally
$x_0 = W - 1 - \max\limits_{\tilde{x} \in Z}\{\tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot y + \tilde{c} \geq 0\} \qquad (51)$
wherein
ã=a, b̃=−b, c̃=−c−W·a−1  (52)
reducing once again to the special case.

The Loader

The Edge Generator 103 works under the assumption of the special case described above, allowing significant reduction of its hardware and resulting in faster operation. The Loader 102 is the element which transforms a general case to the special case, converting an input functional described by a general case into a form expected by the special case, thereby allowing the Edge Generator 103 to compute edge values correctly and efficiently.

The Loader 102 accepts as inputs a functional and a bounding box offset, and produces a set of coefficients a, b, and c according to the special case for the Edge Generator 103.
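The reductions derived in Cases 1, 2, 4 and 5 can be collected into one routine to show how a single special-case evaluator serves all four quadrants. The Python below is a sketch under stated assumptions: the names `load` and `edge_x0` are hypothetical, the closed-form `special_case_x0` stands in for the Edge Generator, and the a=0 cases (3, 6 and 7) are deliberately left out.

```python
def special_case_x0(a_t, b_t, c_t, y):
    """max{x : a_t*x + b_t*y + c_t >= 0} for a_t < 0, b_t >= 0."""
    return (b_t * y + c_t) // (-a_t)


def load(a, b, c, W):
    """Map a general-case functional to special-case coefficients plus a
    function recovering x0 from the special-case result x~0."""
    a_t, b_t = -abs(a), abs(b)
    if a < 0 and b >= 0:                  # Case 1, eq. (23)
        return a_t, b_t, c - 1, lambda xt: xt
    if a < 0 and b < 0:                   # Case 2, eqs. (28)-(29)
        return a_t, b_t, -c - W * a, lambda xt: W - 1 - xt
    if a > 0 and b <= 0:                  # Case 4, eqs. (37)-(38)
        return a_t, b_t, -c - 1, lambda xt: xt + 1
    if a > 0 and b > 0:                   # Case 5, eqs. (41)-(42)
        return a_t, b_t, c + W * a, lambda xt: W - xt
    raise ValueError("a = 0 (Cases 3, 6, 7) is not covered by this sketch")


def edge_x0(a, b, c, y, W):
    """Span end point for row y via the special case."""
    a_t, b_t, c_t, recover = load(a, b, c, W)
    return recover(special_case_x0(a_t, b_t, c_t, y))
```

In every branch the loaded pair is (−|a|, |b|), so the evaluator only ever sees a negative x coefficient and a non-negative y coefficient; only the constant term and the final mapping differ between cases.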

We have:
$F(x, y) = a \cdot x' + b \cdot y' + c', \quad (x, y) \in [x_{\min}, x_{\max}] \times [y_{\min}, y_{\max}] \qquad (53)$
Since the functional coefficients are expressed in the main grid and the x, y coordinates are expressed in the over-sampling grid, we have a grid ratio of s=2^(6+[0, 1, 2]) and will convert the c′ value to the over-sampling grid. The particular conversion depends on the type of the half-plane at hand. For a closed half-plane the conversion is as follows:
$\begin{aligned}
f(x, y) = a \cdot x' + b \cdot y' + c' \geq 0 &\Rightarrow \\
a \cdot s \cdot x + b \cdot s \cdot y + c' \geq 0, \quad x' = s \cdot x, \quad y' = s \cdot y &\Rightarrow \\
a \cdot x + b \cdot y + \frac{c'}{s} \geq 0 &\Rightarrow \\
a \cdot x + b \cdot y + c \geq 0, \quad c = \mathrm{floor}\left(\frac{c'}{s}\right) &
\end{aligned} \qquad (54)$
For an opened half-plane the conversion is as follows:
$\begin{aligned}
f(x, y) = a \cdot x' + b \cdot y' + c' > 0 &\Rightarrow \\
a \cdot s \cdot x + b \cdot s \cdot y + c' > 0, \quad x' = s \cdot x, \quad y' = s \cdot y &\Rightarrow \\
a \cdot x + b \cdot y + \frac{c'}{s} > 0 &\Rightarrow \\
a \cdot x + b \cdot y + c > 0, \quad c = \mathrm{ceiling}\left(\frac{c'}{s}\right) &
\end{aligned} \qquad (55)$
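The choice of floor for closed half-planes and ceiling for open ones in (54) and (55) keeps the integer test equivalent to the original one. A minimal Python sketch, with the hypothetical name `to_oversampling_c`, makes the rounding explicit:

```python
def to_oversampling_c(c_main, s, closed):
    """Convert c' from the main grid to the over-sampling grid.
    Closed half-plane (>= 0): floor(c'/s); open half-plane (> 0): ceiling(c'/s)."""
    if closed:
        return c_main // s            # Python // floors toward minus infinity
    return -((-c_main) // s)          # ceiling expressed through floor
```

For any integer n standing for a·x+b·y, the test s·n+c′≥0 holds exactly when n+floor(c′/s)≥0, and s·n+c′>0 holds exactly when n+ceiling(c′/s)>0.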

It is an advantageous aspect of the present invention that two or more Edge Generators 103 may participate in span generation for the same functional. In one embodiment of the present invention, wherein k=1 (respectively 2 or 4) Edge Generators 103 participate in the span generation for the same functional, we want the first span of the 2 (respectively 4 or 8) spans generated per clock cycle to be aligned by y coordinate by 2 (respectively 4 or 8) accordingly. To accomplish this, denote

$\tilde{y}_{\min} = \mathrm{floor}\left( \frac{y_{\min}}{k \cdot 2} \right) \cdot k \cdot 2,$

and substitute $\tilde{\tilde{x}} = x - x_{\min}$, $\tilde{\tilde{y}} = y - y_{\min}$, $\tilde{\tilde{c}} = c - a \cdot x_{\min} - b \cdot \tilde{y}_{\min}$ to obtain

$f(\tilde{\tilde{x}}, \tilde{\tilde{y}}) = a \cdot \tilde{\tilde{x}} + b \cdot \tilde{\tilde{y}} + \tilde{\tilde{c}}$  (56)

The size of the bounding box is (x_(max)−x_(min))·(y_(max)−y_(min)). Here we take

$m = \mathrm{ceiling}(\log_2 (x_{\max} - x_{\min}))$  (57)

$W = 2^m$  (58)

Observing the above cases, taking (23), (52), (38) and (42) into consideration and uniting common expressions results in

$\tilde{a} = -|a|$  (59)

$\tilde{\tilde{b}} = |b|$  (60)

$\tilde{c} = \left\{ \begin{matrix}
\tilde{\tilde{c}} - 1, & a < 0 \wedge b \geq 0 \\
-\tilde{\tilde{c}} - a \cdot W, & a \leq 0 \wedge b < 0 \\
-\tilde{\tilde{c}} - 1, & a > 0 \wedge b \leq 0 \\
\tilde{\tilde{c}} + a \cdot W, & a \geq 0 \wedge b > 0
\end{matrix} \right.$  (61)

The number of c values generated according to the foregoing description corresponds to the number of spans that are to be generated per clock cycle, wherein an Edge Generator 103 generates two spans per clock cycle. Each Edge Generator's 103 spans are to be aligned by y such that the first span is even (i.e. y_(min) mod 2=0) and the second is odd (i.e. y_(min) mod 2=1). If the y_(min) of the bounding box is odd, span generation starts from y_(min)−1. To accomplish that, denote

$c_0^0 = \tilde{c} - \tilde{\tilde{b}} \cdot (y_{\min} \bmod 2)$

$c_1^0 = c_0^0 + \tilde{\tilde{b}}$

$\tilde{b} = 2 \cdot \tilde{\tilde{b}}$  (62)

In the case of more than one Edge Generator 103 participating in span generation for the functional, we need to have more than one set of initial values for the spans. Assuming the number of Edge Generators 103 is k (wherein k=1, 2 or 4), the set of initial values is given by

$c_j^i = c_j^0 + 2 \cdot \tilde{\tilde{b}} \cdot i, \quad i = 1, \ldots, k, \quad j = 0, 1$

$\tilde{b} = k \cdot \tilde{\tilde{b}}$  (63)

and the Edge Generators 103 participating in the span generation for the given functional are loaded with the initial values of $c_j^i$, $\tilde{b}$ and $\tilde{a}$.

Moving Down

Before the Bresenham traversal, an Edge Generator 103 performs two operations: moving-down and Bresenham setup. The initial values are

$f(\tilde{x}, \tilde{y}) = \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot \tilde{y} + \tilde{c}, \quad \tilde{x} = 0, \quad \tilde{y} = 0$  (64)

with the goal of computing for each given $\tilde{y}$

$\tilde{x}_0 = \mathrm{floor}\left( -\frac{\tilde{b} \cdot \tilde{y} + \tilde{c}}{\tilde{a}} \right), \quad \tilde{c} \geq 0$  (65)

Additionally, an Edge Generator 103 generates an $\tilde{x}$ inside the bounding box. Therefore, if x₀ is outside the bounding box, x₀ is substituted by 0 or W such that

$\tilde{x} = \left\{ \begin{matrix}
0, & \tilde{x}_0 < 0 \\
\tilde{x}_0, & \tilde{x}_0 \in [0, W] \\
W, & \tilde{x}_0 > W
\end{matrix} \right.$  (66)

After converting to a special case within the bounding box, we have $f(\tilde{x}, \tilde{y}) < 0$ for the points above the edge (represented by the functional) and $f(\tilde{x}, \tilde{y}) \geq 0$ on or below the edge, wherein “above” refers to smaller y coordinates and “below” refers to greater y coordinates. We also have b≥0 and a<0 as given by the special case conditions.
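For illustration only, the computation of (65) followed by the substitution of (66) may be sketched as below. The function names `floor_div` and `span_x` are hypothetical and are not part of the described hardware, which obtains the same value iteratively rather than by a direct division.

```cpp
// Floor division for a possibly negative numerator and a positive denominator.
long floor_div(long num, long den) {
    long q = num / den;                       // truncates toward zero
    return (num % den != 0 && num < 0) ? q - 1 : q;
}

// a_t < 0 and b_t >= 0 are the special-case coefficients, c_t the free term,
// y the row inside the bounding box and W the extended box width.
long span_x(long a_t, long b_t, long c_t, long y, long W) {
    long x0 = floor_div(b_t * y + c_t, -a_t); // (65): floor(-(b*y + c) / a)
    if (x0 < 0) return 0;                     // (66): edge lies left of the box
    if (x0 > W) return W;                     // (66): edge lies right of the box
    return x0;
}
```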

FIG. 4 is a flow diagram illustrating a method for the moving-down process in preparation for the Bresenham setup, according to an embodiment of the present invention. The moving-down process starts 200 with $\tilde{y}_k = 0$. If 201 the functional value $f(0, \tilde{y}_k) \geq 0$, the moving-down process is complete 202. Otherwise 203, move down along the $\tilde{x} = 0$ border of the bounding box by adding 204 $\tilde{b}$ to the functional value at the rate of one increment per clock cycle (wherein $\tilde{b} \geq 0$ and a $\tilde{b}$ increment of the functional value corresponds to incrementing y by 1) until $f(0, \tilde{y}_k) \geq 0$ 201 (wherein k>i), at which point the moving-down process is complete 202. The moving-down process is represented by the following iterative description:

$\tilde{y}_0 = 0$

$f_0 = f(0, 0) = \tilde{a} \cdot 0 + \tilde{b} \cdot 0 + \tilde{c} = \tilde{c}$

$f_i = f(0, i) = \tilde{b} \cdot i + f_0 = \tilde{b} \cdot (i - 1) + \tilde{b} + f_0 = f_{i-1} + \tilde{b}$  (67)

Bresenham Setup

The moving-down process is followed by the Bresenham setup process. The purpose of the Bresenham setup is to find the two values

$x_0 = \max\limits_{\tilde{x} \in Z} \{ \tilde{x} : \tilde{a} \cdot \tilde{x} + \tilde{b} \cdot \tilde{y}_k + \tilde{c} \geq 0 \}$  (68)

and

$\Delta x = \mathrm{floor}\left( -\frac{\tilde{b}}{\tilde{a}} \right)$  (69)

Furthermore, since

$\tilde{b} \cdot \tilde{y}_k + \tilde{c} = f(0, \tilde{y}_k)$  (70)

we obtain

$x_0 = \mathrm{floor}\left( \frac{f(0, \tilde{y}_k)}{-\tilde{a}} \right)$  (71)

The division algorithm described above (see Special Case) is modified as follows for more efficient hardware implementation:

$c_0 = f(0, \tilde{y}_k), \quad a_0 = \tilde{a} \cdot 2^{m+1}, \quad b_0 = \tilde{b}, \quad x_{0,0} = \Delta x_0 = 0$  (72)

with the following steps describing the iterations:

$\begin{matrix}
c_{i+1} = 2 \cdot \left\{ \begin{matrix} c_i + a_0, & c_i + a_0 \geq 0 \\ c_i, & c_i + a_0 < 0 \end{matrix} \right. \\
x_{0,i+1} = 2 \cdot x_{0,i} + \left\{ \begin{matrix} 1, & c_i + a_0 \geq 0 \\ 0, & c_i + a_0 < 0 \end{matrix} \right. \\
b_{i+1} = 2 \cdot \left\{ \begin{matrix} b_i + a_0, & b_i + a_0 \geq 0 \\ b_i, & b_i + a_0 < 0 \end{matrix} \right. \\
\Delta x_{i+1} = 2 \cdot \Delta x_i + \left\{ \begin{matrix} 1, & b_i + a_0 \geq 0 \\ 0, & b_i + a_0 < 0 \end{matrix} \right.
\end{matrix}, \quad i = 1, 2, \ldots, m + 1$  (73)

The values $e_0 = c_{m+1} = f(0, \tilde{y}_k) \bmod |a|$ and $r_0 = b_{m+1} = |b| \bmod |a|$ are used in the Bresenham walk (described below) for calculating the Bresenham error. The value

$x_0 = \mathrm{floor}\left( \frac{f(0, \tilde{y}_k)}{-\tilde{a}} \right)$

is the x value for the first span after the moving-down process. The value

$\Delta x = \mathrm{floor}\left( -\frac{\tilde{b}}{\tilde{a}} \right)$

is the span-to-span x-increment value. FIG. 5 is a flow diagram illustrating the foregoing method for the Bresenham setup process, according to an embodiment of the present invention.

Bresenham Walk

The Bresenham walk is the process following the moving-down and Bresenham setup processes. After the Bresenham setup we have

$\tilde{a} \cdot \tilde{x} + \tilde{b} \cdot \tilde{y} + \tilde{c} = 0$  (74)

wherein

$e_0 = f(0, \tilde{y}_k) \bmod |a|$  (75)

$x_0 = \mathrm{floor}\left( \frac{f(0, \tilde{y}_k)}{-\tilde{a}} \right)$

$r_0 = |b| \bmod |a|$

$\Delta x = \mathrm{floor}\left( -\frac{\tilde{b}}{\tilde{a}} \right)$

and

$\tilde{a} \cdot \tilde{x}_n + \tilde{b} \cdot \tilde{y}_{n+k} + \tilde{c} = \tilde{a} \cdot \tilde{x}_n + \tilde{b} \cdot \tilde{y}_{n+k} + f(0, \tilde{y}_k) - \tilde{b} \cdot \tilde{y}_k = 0 \Leftrightarrow$  (76)

$\tilde{a} \cdot \tilde{x}_n + \tilde{b} \cdot \tilde{y}_n + f(0, \tilde{y}_k) = 0$  (77)

and we want to find

$\begin{matrix}
\tilde{x}_n = -\frac{\tilde{b}}{\tilde{a}} \cdot \tilde{y}_n - \frac{f(0, \tilde{y}_k)}{\tilde{a}} \Leftrightarrow \\
\tilde{x}_n = \frac{f(0, \tilde{y}_k)}{|a|} + \frac{|b|}{|a|} \cdot \tilde{y}_n \Leftrightarrow \\
\tilde{x}_n = x_0 + \frac{e_0}{|a|} + \Delta x \cdot n + \frac{r_0}{|a|} \cdot n \Leftrightarrow \\
\tilde{x}_n = \tilde{x}_{n-1} + \Delta x + \frac{e_{n-1} + r_0}{|a|} \Rightarrow \\
\tilde{x}_n = \tilde{x}_{n-1} + \Delta x + \left\{ \begin{matrix} 0, & e_{n-1} + r_0 < |a| \\ 1, & e_{n-1} + r_0 \geq |a| \end{matrix} \right. \\
e_n = e_{n-1} + r_0 - \left\{ \begin{matrix} 0, & e_{n-1} + r_0 < |a| \\ |a|, & e_{n-1} + r_0 \geq |a| \end{matrix} \right.
\end{matrix}, \quad n = 1, 2, \ldots, h - y_k$  (78)

wherein h represents a height of the bounding box and y_(k) represents the value of the y coordinate at the Bresenham setup point. To simplify the hardware, the error value is decremented by |a| at the beginning of the Bresenham walk, after which e_(n) can be compared to 0, with the comparison being simpler to implement in hardware. We also calculate

$r_1 = r_0 - |a|$

$\tilde{e}_0 = e_0 + r_0 - |a|$  (79)

after which the Bresenham walk is more simply described as follows:

$\begin{matrix}
\tilde{x}_n = \tilde{x}_{n-1} + \Delta x + \left\{ \begin{matrix} 0, & \tilde{e}_{n-1} < 0 \\ 1, & \tilde{e}_{n-1} \geq 0 \end{matrix} \right. \\
\tilde{e}_n = \tilde{e}_{n-1} + \left\{ \begin{matrix} r_0, & \tilde{e}_{n-1} < 0 \\ r_1, & \tilde{e}_{n-1} \geq 0 \end{matrix} \right.
\end{matrix}, \quad n = 1, 2, \ldots, h - y_k$  (80)

FIG. 6 is a flow diagram illustrating a method for the Bresenham walk process, according to an embodiment of the present invention.

Span Generator Structure

FIG. 7 is a block diagram illustrating the Span Generator 101, according to an embodiment of the present invention. The Span Generator 101 comprises

- An Input Interface 105
- 3 Loaders 102
- 12 Edge Generators 103
- 4 cascaded 3-input Sorters 104
- An Output Interface 106
- A scissoring box module 107

Input Interface 105 packs input functionals for passing to the three Loaders 102. Loaders 102 perform Edge Generator 103 initialization. Edge Generators 103 generate “left” and “right” edges, which are then sorted in tournament Sorters 104. The Sorters' 104 output is directed via Output Interface 106 to a Tile Generator (TG), which converts a set of spans into a sequence of tiles, wherein a tile refers to a rectangular set of pixels to be rendered.

Advantageously, the Span Generator 101 solves the following issues:

1. The Span Generator 101 produces spans for a triangle having up to 15 functionals. The X and Y clipping is performed by the scissoring box module 107, and thus 11 functionals remain. For reasons described in items 3 and 4, there are 12 Edge Generators 103 in the Span Generator 101 architecture.
2. The Span Generator 101 generates at least two spans per clock cycle, presenting a doubling of performance when compared to generating one span per clock cycle, for 30% more cost.
3. In the case of a reduced set of functionals (i.e. fewer than 7 or 8) the Span Generator 101 can generate more than two spans per clock cycle. In this case we use two Edge Generators 103 to process the same functional. The Loaders 102 set up the Edge Generators 103 at different spans according to the initial offsets of the respective Edge Generators 103. Analogously, in the case of fewer than 4 functionals, the span generation rate reaches eight spans per clock cycle.
4. The Loaders 102 provide the maximal Span Generator 101 performance for the most general case, which is a case involving 3 functionals. Thus the Span Generator 101 comprises 3 Loaders 102, wherein a Loader 102 can load four Edge Generators 103 sequentially.
5. For non-adaptive over-sampling with a rotating grid, the Span Generator 101 performs clipping by several half-planes with a known tangent, a process that can be done using a separate device.
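As a non-limiting sketch of items 1 through 4, the relation between the number of functionals and the span rate may be modeled as follows. The function names are hypothetical, and the thresholds (fewer than 4, fewer than 7) are taken from the Loader pseudo-code given later in this description; with 12 Edge Generators 103 they correspond to 3 functionals served by four generators each, up to 6 functionals served by two each, and up to 11 functionals served by one each.

```cpp
// Hypothetical model: number of Edge Generators serving one functional.
int generators_per_functional(int num_functionals) {
    if (num_functionals < 4) return 4;   // 3 functionals x 4 generators = 12
    if (num_functionals < 7) return 2;   // up to 6 functionals x 2 generators
    return 1;                            // up to 11 functionals, one generator each
}

// Each Edge Generator produces two spans per clock cycle.
int spans_per_clock(int num_functionals) {
    return 2 * generators_per_functional(num_functionals);
}
```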

External Assumptions of Data Formats

| Value | Range | Bits for representation | Comment |
|---|---|---|---|
| Window size, pixels | [0 . . . 2¹² − 1] × [0 . . . 2¹² − 1] | 12 | To be able to draw into 4096 × 4096 texture |
| Maximum divisions of oversampling grid per pixel | 2² | 2 | Not the same as vertex subpixel grid, it is coarser. The functional coefficients will be given in the vertex subpixel grid while the x, y coordinates are in the oversampling one. |
| Window size, over-samples | [0 . . . 2¹⁴] × [0 . . . 2¹⁴] | 15 | Extreme window's pixels in rotated grid coordinates |
| Vertex X, Y after clipping, over-samples | [0 . . . 2¹⁴ + 1] × [0 . . . 2¹⁴ + 1] | 15 | We need one more grid position on the right and bottom as otherwise the last column (row) of pixels cannot be drawn (with tight clipping) because of open/close convention, hence a value of 2¹⁴ + 1 is possible here |
| Subpixel vertex position, per pixel (subpixel bits) | [0 . . . 2⁸ − 1] | 8 | Main grid for the triangle setup |
| Vertex X, Y after clipping, sub-pixel units | [0 . . . 2²⁰ + 1] × [0 . . . 2²⁰ + 1] | 21 | We need one more grid position on the right and bottom as otherwise the last column (row) of pixels cannot be drawn (with tight clipping) because of open/close convention, hence a value of 2²⁰ + 1 is possible here |

Internal Data Formats

| Value | Range | Bits for representation | Comment |
|---|---|---|---|
| Edge functional coefficients a_(i), b_(i), see below | [−2²⁰ − 1 . . . 2²⁰ + 1] | 21 + sign | See below |
| Edge functional coefficients (in a window coordinate system after setup) c_(i) (see below) | ±(2⁴⁰ + 2²¹ + 1) | 41 + sign | See below |
| Bounding box origin (x_(min), y_(min)) in oversampling grid units | [0 . . . 2¹⁴] | 15 | Bounding box origin is inclusive; it values the first x position to draw and the first span to draw (if span is not empty). The bounding box is defined as an original bounding box of a triangle intersected with the scissoring box. If no scissoring box exists, then the window box is used as a scissoring box. |
| Edge functional coefficients after shifting to the bounding box system c_(i), see below | ±(2⁴⁰ + 2³⁵ + 2²¹ + 2¹⁵ + 1) | 41 + sign | |
| Bounding box maximum point (x_(max), y_(max)) | [0 . . . 2¹⁴ + 1] | 15 | Bounding box max point is inclusive; it values the last x position to draw and the last span to draw (if span is not empty). |
| Non adjusted bounding box width x_(max) − x_(min) | [0 . . . 2¹⁴ + 1] | 15 | The box with the width of 0 can have a single pixel column inside, since both sides of the box are inclusive. |
| Extended bounding box width x_(max) − x_(min) rounded to the next power of 2 | 2^([0 . . . 15]) | 4 | Adjusted (extended) bounding box is used in the interpolator, since the width is to have a value of a power of two. Note: the extended box can be wider than the window. |

Input Interface

The Span Generator 101 has the following input interface:

| Field name | Length, bit | Description |
|---|---|---|
| L | 1 | The signal to start loading the first three functionals |
| M | 1 | Mode: 0 - standard, 1 - wire-frame |
| R | 8 | The width of the wire-frame line in the current grid units |
| F | 4 | The number of the functionals. In the case of the wire-frame mode, the three LSB are the mask for drawing the edges (0 indicates do not draw, 1 indicates draw), and MSB is a request to extend the bounding box by W/2 in all directions |
| A | 22 × 3 | The value of the a coefficients for the 11 functionals. 0 if the particular functional is not present |
| B | 22 × 3 | The value of the b coefficients for the 11 functionals. 0 if the particular functional is not present |
| C | 42 × 3 | The value of the c coefficients for the 11 functionals. 0 if the particular functional is not present |
| X0 | 15 | The start x value for the left edge of the scissoring box |
| X1 | 15 | The start x value for the right edge of the scissoring box |
| Y | 15 | The value of the y coordinate in the top corner of the scissoring box |
| Y0 | 15 | The value of the y coordinate in the left corner of the scissoring box |
| Y1 | 15 | The value of the y coordinate in the right corner of the scissoring box |
| Y2 | 15 | The value of the y coordinate in the bottom corner of the scissoring box |
| T | 2 | The tangent of the slope of the left edge of the scissoring box, according to the following: 00 The right edge is vertical; 01 The tangent is 1; 10 The tangent is 2; 11 The tangent is 3 |
| XMIN | 15 | The value of the x coordinate for the left edge of the bounding box |
| XMAX | 15 | The value of the x coordinate for the right edge of the bounding box |
| YMIN | 15 | The value of the y coordinate for the top edge of the bounding box |
| YMAX | 15 | The value of the y coordinate for the bottom edge of the bounding box |

Wire-frame

We assume the wire-frame will be done as three functionals for edges inside the tight bounding and scissoring boxes. That means we do not support clipping planes for wire-frame. Span generation for the wire-frame mode requires nothing special, except that the Loader 102 should supply corrected functional values for two nested triangles. The inner triangle is a set of points on the current grid, which should be excluded from the outer triangle. For an edge f(x, y)=a·x+b·y+c, the functional values for those two triangles will be

ƒ₁(x, y)=a·x+b·y+c+w/2 (outer edge)

ƒ₂(x, y)=a·x+b·y+c−w/2 (inner edge)

where w is the width of the wire-frame edges.

Loader

FIG. 8 is a block diagram illustrating a Loader 102 (without shifters), according to an embodiment of the present invention. Loader 102 comprises the following inputs:

    SHORT xMin, yMin, xMax, yMax, a, b;
    LONG c;
    SHORT nF;  // the number of the functionals

and outputs

    SHORT c0l, c0h, c1l, c1h, bl, bh, al, ah, m;
    BOOL dir, cor;

Initially, a Loader 102 determines the global values, which are the same for all of the functionals in the polygon. To accomplish this, the Loader 102 computes the parameters of the bounding box:

    SHORT w = xMax - xMin;
    m = ceiling (log2 (w));
    SHORT W = 1 << m;                          // 2**m
    SHORT h = yMax - yMin;
    SHORT k = (nF > 6)? 1 : (nF > 3)? 2 : 4;   // Edge Generators per functional
    SHORT aT, bT;                              // ã and b̃
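For illustration, the bounding box parameters m, W and h may be computed in software as sketched below. The names `ceil_log2`, `BoxParams` and `box_params` are assumptions of this sketch, as is the treatment of a zero width as m=0.

```cpp
// Smallest m with 2^m >= w, i.e. ceiling(log2(w)) for w >= 1 (m = 0 for w = 0).
int ceil_log2(unsigned w) {
    int m = 0;
    while ((1u << m) < w) ++m;
    return m;
}

struct BoxParams {
    int m;        // exponent of the extended width
    unsigned W;   // extended width, 2^m
    unsigned h;   // height of the bounding box
};

BoxParams box_params(unsigned xMin, unsigned yMin, unsigned xMax, unsigned yMax) {
    const unsigned w = xMax - xMin;
    const int m = ceil_log2(w);
    return BoxParams{m, 1u << m, yMax - yMin};
}
```

Rounding the width up to a power of two is what allows the later division to run for a fixed number of m+1 steps.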

Then for each functional the Loader 102 computes

    nCase = (a < 0 && b >= 0)? 1 :
            (a <= 0 && b < 0)? 2 :
            (a > 0 && b <= 0)? 4 :
            (a >= 0 && b > 0)? 5 : 0;        // but the “0” is redundant
    BOOL cor = (nCase > 3)? 1 : 0;
    BOOL dir = (nCase < 2 || nCase > 4)? 0 : 1;
    LONG cT2 = c - a * xMin - b * yMin;      // double-tilde c, see (56)
    LONG cT;                                 // c̃, see (61)
    switch (nCase) {
      case 1:
        aT = a;
        bT = b;
        cT = cT2 - 1;
        break;
      case 2:
        aT = a;
        bT = -b;
        cT = -cT2 - a * W;
        break;
      case 4:
        aT = -a;
        bT = -b;
        cT = -cT2 - 1;
        break;
      case 5:
        aT = -a;
        bT = b;
        cT = cT2 + a * W;
        break;
      }
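A minimal software sketch of this per-functional conversion, written directly from equations (59) through (61) and from the dir and cor definitions above, might look as follows. The structure `SpecialCase` and the function `to_special_case` are hypothetical names; the degenerate input a=0, b=0 is not covered by the four cases and is left with a zero free term in this sketch.

```cpp
struct SpecialCase {
    long a;     // ã = -|a|
    long b;     // |b|
    long c;     // c̃ per (61)
    bool dir;   // true for cases 2 and 4
    bool cor;   // true for cases 4 and 5
};

// c2 is the free term already shifted to the bounding box origin; W = 2^m.
SpecialCase to_special_case(long a, long b, long c2, long W) {
    const long abs_a = a < 0 ? -a : a;
    const long abs_b = b < 0 ? -b : b;
    SpecialCase s{-abs_a, abs_b, 0, false, false};
    if (a < 0 && b >= 0) {                 // case 1
        s.c = c2 - 1;
    } else if (a <= 0 && b < 0) {          // case 2
        s.c = -c2 - a * W;
        s.dir = true;
    } else if (a > 0 && b <= 0) {          // case 4
        s.c = -c2 - 1;
        s.dir = true;
        s.cor = true;
    } else if (a >= 0 && b > 0) {          // case 5
        s.c = c2 + a * W;
        s.cor = true;
    }
    return s;
}
```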

The Loader 102 then computes two separate functional values for two sequential spans, and in the case of having k=1, 2, 4 Edge Generators 103 per functional, the Loader 102 also computes values for all other two or six sequential spans:

    c [0] = cT - bT * (yMin % 2);
    c [1] = c [0] + bT;
    for (i = 1; i < k; i ++) {
      c [i * 2    ] = c [i * 2 - 2] + 2 * bT;
      c [i * 2 + 1] = c [i * 2 - 1] + 2 * bT;
      }
    bT <<= k;
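The initial values for the 2·k sequential spans may be illustrated by the following sketch, which mirrors the loop above. The function name `initial_span_values` is hypothetical, and yMin is assumed to be non-negative so that the remainder yMin % 2 is 0 or 1; the final adjustment of the span-to-span step is intentionally left out.

```cpp
#include <vector>

// Returns the 2*k initial functional values for consecutive spans, starting
// at the even row at or immediately above yMin.
std::vector<long> initial_span_values(long cT, long bT, long yMin, int k) {
    std::vector<long> c(2 * k);
    c[0] = cT - bT * (yMin % 2);   // step back one row if yMin is odd
    c[1] = c[0] + bT;              // the following (odd) row
    for (int i = 1; i < k; ++i) {
        c[2 * i]     = c[2 * i - 2] + 2 * bT;
        c[2 * i + 1] = c[2 * i - 1] + 2 * bT;
    }
    return c;
}
```

Each entry differs from the previous one by bT, so entry j is the functional value at x=0 on row j counted from the aligned starting row.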

FIG. 9 is a block diagram illustrating $\tilde{b}$ and $\tilde{c}$ values wrapping before they are loaded into an Edge Generator 103, according to an embodiment of the present invention. The division algorithm is described above (see Special Case). If it is performed literally, however, the $\tilde{a}$ value needs to be scaled before the division by multiplying it by 2^(m), which scales $\tilde{a}$ out of the short range. Nevertheless, at each clock of the division the effective length of the subtraction is still within the short range. Thus, instead of scaling the $\tilde{a}$ value, the $f(0, \tilde{y}_k)$ value is scaled by 2^(−m) before the division, and then, instead of dividing the scaled $\tilde{a}$ value by 2 each clock, the scaled $f(0, \tilde{y}_k)$ value is multiplied by 2. Since a $\Delta x = \mathrm{floor}\left( -\frac{\tilde{b}}{\tilde{a}} \right)$ value is also needed, the $\tilde{b}$ value is also pre-scaled. The scaled $f(0, \tilde{y}_k)$ value is longer than the non-scaled value. However, this does not necessitate a longer adder for performing the moving-down process: the least significant bits of the scaled $f(0, \tilde{y}_k)$ value are wrapped to the most significant bits (i.e. a cyclic rotation instead of an arithmetical shift), resulting in the scaled $f(0, \tilde{y}_k)$ value being expressed within the same bit-length as the non-scaled value. To avoid carry propagation from MSB to LSB, invert the sign bit before loading data into an Edge Generator 103. In the case of $f(0, \tilde{y}_k) < 0$ this bit would be 0 and would not propagate a carry. To detect if $f(0, \tilde{y}_k) \geq 0$, compare this bit to 1. The $\tilde{b}$ value is scaled in a similar way, with the difference that it is not wrapped.

At the first clock cycle of the division process, the Edge Generator 103 determines whether either of the $f(0, \tilde{y}_k)$ or $\tilde{b}$ values exceeds the boundaries, i.e. it determines whether the division result would be greater than or equal to W. For that purpose, the real scale factor is not m, but m+1. The division works in the above-described way, but if the result is not below W, either x₀ will be beyond the bounding box limit or the result after the first Bresenham step would be beyond the bounding box limit.
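The following sketch illustrates the idea of a bit-serial division that produces m+1 quotient bits and flags a result that is not below W=2^m. It is a plain restoring shift-and-subtract division written for clarity, not the wrapped-register scheme of the Edge Generator 103; the names `DivResult`, `restoring_divide` and `beyond_box` are assumptions of the sketch.

```cpp
struct DivResult {
    unsigned long quotient;
    unsigned long remainder;
    bool beyond_box;   // true if the quotient is >= W = 2^m
};

// Divides num by den (den > 0), producing m + 1 quotient bits, MSB first.
DivResult restoring_divide(unsigned long num, unsigned long den, int m) {
    DivResult r{0, num, false};
    // A quotient that needs more than m + 1 bits is certainly beyond the box.
    if ((num >> (m + 1)) >= den) {
        r.beyond_box = true;
        return r;
    }
    for (int i = m; i >= 0; --i) {
        const unsigned long trial = den << i;   // divisor aligned to bit i
        r.quotient <<= 1;
        if (r.remainder >= trial) {             // subtraction succeeds: bit is 1
            r.remainder -= trial;
            r.quotient |= 1;
        }
    }
    r.beyond_box = (r.quotient >> m) != 0;      // extra bit set means >= W
    return r;
}
```

The extra (m+1)-th bit is what makes the out-of-box condition visible without a separate comparison against W.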

The Loader 102 loads the Edge Generators 103 sequentially, starting from the first three functionals of each triangle, with the first functional loaded into the first Edge Generator 103, and so on. If there are only three functionals, the Loader 102 loads the other Edge Generators 103 with the functional values for the other three groups of spans on the next sequential clock cycles.

Considering the input interface and the approach of loading several Edge Generators 103 at subsequent clock cycles, the pseudo-code for the Loader 102 is as follows:

    template <int N>
    void Loader<N> (   // pipelined, performed each clock
                       // input interface:
      bool L,          // the first clock of loading the L = 1
      bool M,          // M = 1 in the wire-frame mode
      BIT2 Os,         // oversampling grid to pixel grid relation:
                       // 0 - 4x, 1 - 2x, 2 - 1x
      BIT8 R,          // the width of a wire-frame line
      BIT4 F,          // the number of functionals, edge mask in wire-frame mode
      SHORT A,         // the first coefficient
      SHORT B,         // the second coefficient
      LONG C,          // the free member
      BIT21 XMIN,      // the left edge of the bounding box
      BIT21 YMIN,      // the top edge of the bounding box
      BIT21 XMAX,      // the right edge of the bounding box
      BIT21 YMAX,      // the bottom edge of the bounding box
      BIT21 XFUN,      // the X coordinate of the zero functional point
      BIT21 YFUN       // the Y coordinate of the zero functional point
      ) {
      BIT3 toGo = (L)? 4 : toGo - 1;             // counts the number of functionals to load
      BIT3 nClk = (L)? 0 : nClk + (toGo != 0);   // counts loading clocks
      if (L) {
        bool wf = M;
        BIT8 wW = R;
        BIT15 xMIN = ((xMIN >> 5)
        BIT4 nFunct = F;
        BIT2 k = (wf)? 2 : (nFunct < 4)? 4 : (nFunct < 7)? 2 : 1;
        BIT21 w = XMAX - XMIN;
        BIT21 h = YMAX - YMIN;
        BIT4 m = ceiling (log2 (w));
        BIT16 W = 1 << m;                        // 2**m
        BIT15 xMin = XMIN;
        BIT15 yMin = YMIN;
        }
      BIT3 nCase =
        (A < 0 && B >= 0)? 1 :
        (A <= 0 && B < 0)? 2 :
        (A > 0 && B <= 0)? 4 :
        (A >= 0 && B > 0)? 5 :
        0;                                       // redundant, not used
      BOOL cor = nCase > 3;
      BOOL dir = nCase >= 2 && nCase < 5;
      SHORT aT = (A >= 0)? -A : A;               // ã
      SHORT b2T = (B < 0)? -B : B;               // double-tilde b
      LONG cT2 = C -                             // double-tilde c
        A * (xMin - XFUN) -
        B * ((yMin & -(k << 1)) - YFUN);         // align spans by Y
      cT2 >>= (6 + Os);                          // shift to get
      LONG cT = ((dir)? -cT2 : cT2)              // c̃
        + ((nCase == 1 || nCase == 4)? -1 :
           (dir)? -A << m : A << m);
      LONG c [8];
      LONG b [4];
      c [7] = c [5] + (b2T << 1);                // pipelining
      c [6] = c [4] + (b2T << 1);
      c [5] = c [3] + (b2T << 1);
      c [4] = c [2] + (b2T << 1);
      c [3] = c [1] + (b2T << 1);
      c [2] = c [0] + (b2T << 1);
      c [0] = cT - ((yMin & 1)? b2T : 0);
      c [1] = c [0] + b2T;
      SHORT bT = b2T << (1 << k);                // << 2, 4, 8
      }

Edge Generator

FIG. 10 a is a block diagram illustrating an Edge Generator, according to an embodiment of the present invention. The Edge Generator 103 comprises four 24-bit adders and eight 24-bit registers. An adder has the outputs of two registers as inputs, wherein the inputs of the registers are multiplexed:

    SHORT reg [8];
    bool carry [4];
    SHORT add [4];
    add [0] = reg [0] + reg [4] + carry [0];
    add [1] = reg [1] + reg [5] + carry [1];
    add [2] = reg [2] + reg [6] + carry [2];
    add [3] = reg [3] + reg [7] + carry [3];
    for (i = 0; i < 8; i ++)
      reg [i] = some_function (add [k], reg [k], ...);

The registers' outputs are supplied directly to the inputs of the adders to minimize the delay at the adders. The structure of the multiplexers also minimizes the delay at the multiplexers, the maximal post-adder delay being expected to be no more than that of a 3×1 multiplexer.

Besides implementing the general functionality, the multiplexers also perform loading and stalling operations by writing a new set of data or a previous state of an Edge Generator 103 back to the registers.

The basic functionality of an Edge Generator 103 comprises three main phases: moving-down, Bresenham setup and Bresenham walk. There are also seven interim states, which are: load, stall, first clock of moving down, transfer from moving down to Bresenham setup, two different clocks of transfer from Bresenham setup to Bresenham walk and finally first clock of the Bresenham walk.
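For orientation, the three main phases may be modeled in software as sketched below. This is a simplified model under stated assumptions: ordinary integer division stands in for the bit-serial Bresenham setup, the interim states and the clamping to the bounding box width are omitted, and the names `EdgeTrace` and `edge_walk` are hypothetical.

```cpp
#include <vector>

struct EdgeTrace {
    long first_row;         // first row where f(0, y) >= 0
    std::vector<long> xs;   // edge x for rows first_row .. h - 1
};

// a_t < 0 and b_t >= 0 are the special-case coefficients, c_t the free term,
// h the height of the bounding box.
EdgeTrace edge_walk(long a_t, long b_t, long c_t, long h) {
    const long abs_a = -a_t;
    EdgeTrace t{0, {}};
    // Phase 1: moving down along x = 0 until the functional is non-negative.
    long f = c_t, y = 0;
    while (f < 0 && y < h) { f += b_t; ++y; }
    t.first_row = y;
    if (y >= h) return t;            // the edge never enters the box
    // Phase 2: Bresenham setup (quotients and remainders of f and b by |a|).
    long x = f / abs_a, e = f % abs_a;
    const long dx = b_t / abs_a, r0 = b_t % abs_a, r1 = r0 - abs_a;
    e += r0 - abs_a;                 // pre-decrement so the test is against 0
    t.xs.push_back(x);
    // Phase 3: Bresenham walk, one row per iteration.
    for (long row = y + 1; row < h; ++row) {
        if (e >= 0) { x += dx + 1; e += r1; }
        else        { x += dx;     e += r0; }
        t.xs.push_back(x);
    }
    return t;
}
```

For example, with ã=−5, b̃=7, c̃=−11 and h=6, the model moves down two rows and then yields x values equal to floor((7·y−11)/5) for rows 2 through 5.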

An Edge Generator 103 has the following inputs:

    SHORT c0l, c0h, c1l, c1h, bl, bh, al, ah, m;
    BOOL dir, cor, load, stall;

When the load signal is set, the Edge Generator 103 stores the input values in internal registers and resets its state. When the stall signal is set, the Edge Generator 103 registers retain their content for the current clock cycle.

Edge Generator: Moving Down

FIG. 10 b is a block diagram illustrating an Edge Generator 103 during the moving-down phase, according to an embodiment of the present invention. The functional value is accumulated in the register, which was loaded with the value of $\tilde{c}$ at the start. At this phase each Edge Generator 103 performs the following:

    while (c < 0) {
      c = c + b;
      }

Applying this to the hardware, we obtain:

    // at loading stage
    SHORT mm = bitlength (SHORT) - logm - 1;
    SHORT mask_b = 1 << mm;
    SHORT mask_a = mask_b - 1;
    SHORT mask_o = -1 << (mm - 1);
    SHORT clock = 0, repeat = 1;
    bool pl, ph, rl = 0, rh = 0;
    while (repeat) {
      pl = rl;
      ph = rh;
      if (clock == 0) {
        clock = 1;
        if (ch & mask_b) { repeat = 0; continue; }
        rl = carry (cl + bl + ph);
        cl += bl;
        }
      else {
        rl = carry (cl + bl + ph);
        cl += bl + ph;
        if ((ch & mask_a | mask_o) == -1 && rl) { repeat = 0; continue; }
        rh = carry (ch + bh + pl);
        ch += bh + pl;
        }
      }

The masks are for preliminary zero crossing detection, and their use allows avoiding “backing-down” the functional value, since the data is not written back to ch and the LSBs of ch remain intact. The masks also allow detection of a zero crossing one clock cycle earlier.

Setup

FIG. 10 c is a block diagram illustrating an Edge Generator during the Bresenham setup phase, according to an embodiment of the present invention. The division algorithm was described above under “Special Case”, and is implemented as follows:

    // after moving down setup:
    ch &= ~(mask_a | mask_b);
    // setup
    while (m != 0) {
      BOOL c0l_c = carry (c0l + a);
      if (!c0l_c) c0l += a;
      c0l <<= 1;
      c0l |= carry (c0h << 1);
      c0h <<= 1;
      c0h |= !c0l_c;
      BOOL c1l_c = carry (c1l + a);
      if (!c1l_c) c1l += a;
      c1l <<= 1;
      c1l |= carry (c1h << 1);
      c1h <<= 1;
      c1h |= !c1l_c;
      BOOL bl_c = carry (bl + a);
      if (!bl_c) bl += a;
      bl <<= 1;
      bl |= carry (bh << 1);
      bh <<= 1;
      bh |= !bl_c;
      m = m - 1;
      }

Bresenham Walk

After the Bresenham setup process completes, the four values ch, cl, bh and bl are produced, indicating the Bresenham error, x₀, positive correction value and Δx, respectively. To perform edge generation we also need a negative correction value r₁. The Loader 102 sets the Boolean variables dir and cor. Setting the variable dir to 1 indicates that the Edge Generator 103 subtracts the x value from W. Setting the cor variable to 1 indicates that the Edge Generator 103 adds 1 to the x value. If the x value overflows, an appropriate flag is set depending on the value of the dir variable.

    // after setup
    SHORT fm = (1 << m) - 1;   // negation mask = W - 1
    SHORT nm = (dir)? fm : 0;  // negate if dir == 1
    SHORT om = ~fm;            // overflow mask to detect x < 0 or x >= W
    #define er ch
    #define x0 cl
    #define r0 bh
    #define r1 a
    #define dx bl
    x0 = (nm ^ x0) + cor;      // x0 = W - 1 - x0 + cor
    if (dir)
      dx = ~dx;
    r1 = a + b;                // a is negative, so r1 = |b| - |a|
    // er = er + a;            // but we perform it later at first clock of Bresenham
    // at this point some values are moving to different registers
    // according to general structure of the EG
    // first clock
    SHORT clock = 0;
    BOOL uf = false, ov = false;
    while (1) {
      if (x0 & om)
        if (dir) uf = true;    // x0 must be negative
        else     ov = true;    // x0 must be >= W
      if (uf || ov) continue;  // do not update registers
      if (clock == 0) {
        clock = 1;
        x0 += dx + dir;
        er += r1;
        }
      else {
        x0 += dx + ((er >= 0)? 1 - dir : dir);
        er += (er >= 0)? r1 : r0;
        }
      }

Divider-by-3

For the Scissoring Box, a divider-by-3 is used to multiply the y offset by ⅓. A pseudo-code for a 15-bit divider-by-3 is as follows:

    #define bit(a,n,m) ((a >> n) & ((1 << (m - n + 1)) - 1))
      // not correct in terms of the ANSI C, but works in our case
    #define bitrev(a) (((a & 2) >> 1) | ((a & 1) << 1))
    #define simp(a,b,c,d) ((~d & ~c & b | c & a | d & ~b & ~a) & 1)   // single-bit operation
    #define remh(a,b) simp (a, a >> 1, b, b >> 1)
    #define reml(a,b) simp (a >> 1, a, b >> 1, b)
    #define rems(a,b) ((remh (a, b) << 1) | reml (a, b))
    #define sim1(a,b,c) ((~c & b | c & ~b & ~a) & 1)   // single-bit operation
    #define sim2(a,b,c) ((~c & a | c & b) & 1)         // single-bit operation
    #define remc(a,b) sim1 (a, a >> 1, b)
    #define remd(a,b) sim2 (a, a >> 1, b)
    #define reme(a,b) ((remc (a, b) << 1) | remd (a, b))
    #define remf(a,b) bitrev (reme (bitrev (a), b))
    bit16 div (bit15 a) {
      bit15 c = a & 0x2aaa, d = a & 0x1555;
      c = (c & ~(d << 1)) | (~(c >> 1) & d);   // canonise
      bit1 a14 = bit (a, 14, 14), a13 = bit (a, 13, 13),
           a11 = bit (a, 11, 11), a09 = bit (a, 9, 9),
           a07 = bit (a, 7, 7),   a05 = bit (a, 5, 5),
           a03 = bit (a, 3, 3),   a01 = bit (a, 1, 1);
      bit2 part0 [7] = {
        bit (c, 0, 1),     // bits 00, 01
        bit (c, 2, 3),     // bits 02, 03
        bit (c, 4, 5),     // bits 04, 05
        bit (c, 6, 7),     // bits 06, 07
        bit (c, 8, 9),     // bits 08, 09
        bit (c, 10, 11),   // bits 10, 11
        bit (c, 12, 13)    // bits 12, 13
        };
      bit2 part1 [8] = {
        rems (part0 [1], part0 [0]),
        reme (part0 [1], a01),
        rems (part0 [3], part0 [2]),
        reme (part0 [3], a05),
        rems (part0 [5], part0 [4]),
        reme (part0 [5], a09),
        remf (part0 [6], a14),
        a13 & ~a14
        };
      bit2 part2 [8] = {
        rems (part1 [2], part1 [0]),
        rems (part1 [2], part1 [1]),
        rems (part1 [2], part0 [1]),
        reme (part1 [2], a03),
        rems (part1 [6], part1 [4]),
        remh (part1 [6], part1 [5]),
        reml (part1 [6], part0 [5]),
        remc (part1 [6], a11)
        };
      bit2 part3 [8] = {
        rems (part2 [4], part2 [0]),
        remh (part2 [4], part2 [1]),
        reml (part2 [4], part2 [2]),
        remh (part2 [4], part2 [3]),
        reml (part2 [4], part1 [2]),
        remh (part2 [4], part1 [3]),
        reml (part2 [4], part0 [3]),
        remc (part2 [4], a07)
        };
      bit14 m = bit (a, 0, 13) ^ (
        ((part3 [0] & 1) << 0) |
        ((part3 [1] & 1) << 1) |
        ((part3 [2] & 1) << 2) |
        ((part3 [3] & 1) << 3) |
        ((part3 [4] & 1) << 4) |
        ((part3 [5] & 1) << 5) |
        ((part3 [6] & 1) << 6) |
        ((part3 [7] & 1) << 7) |
        ((part2 [4] & 1) << 8) |
        ((part2 [5] & 1) << 9) |
        ((part2 [6] & 1) << 10) |
        ((part2 [7] & 1) << 11) |
        ((part1 [6] & 1) << 12) |
        ((part1 [7] & 1) << 13));
      return (m << 2) | part3 [0];   // pack the remainder together
      }

Scissoring Box and Synchronization

FIG. 11 a is a block diagram illustrating a Scissoring Box origin, according to an embodiment of the present invention. The Span Generator 101 comprises a Scissoring Box module 107 for providing scissoring by a view-port, rotated relative to the x and y axes by an angle with tangent 0, 1, ½ and ⅓ (hereinafter also referred to as tangent 0, 1, 2, 3, respectively). The vertical coordinate y₀ of the upper-left corner of the rotated Scissoring Box is 0, and the horizontal coordinate x₁ of the lower-left corner is also 0. Optionally, the Scissoring Box can be used in an embodiment of the present invention having an over-sampling scheme.

The Scissoring Box has its origin specified by four points. The coordinates of the points are calculated by the driver (i.e. the software controlling the graphics chip) and stored in registers. The y coordinate of the upper corner is y₀=0. The Scissoring Box device performs calculation of the initial Scissoring Box coordinates for the first span. After that, the Scissoring Box device calculates up to eight Scissoring Box coordinates per clock cycle for current spans.

FIG. 11 b is a block diagram illustrating a Scissoring Box, according to an embodiment of the present invention. The device to draw the Scissoring Box generates spans between two edges of the Scissoring Box. Two parts of the Scissoring Box generate both edges using the information about starting values of x and y coordinates, y coordinates of corners and rotation angle tangent:

    void ScissoringBox (
      SHORT x0,   // starting left x
      SHORT x1,   // starting right x
      SHORT y,    // starting y (Ymin)
      SHORT y0,   // y coordinate for left corner
      SHORT y1,   // y coordinate for right corner
      SHORT y2,   // ending y (Ymax)
      char t) {   // 2-bit tangent expression
      char cnt0 = t, cnt1 = t;
      while (y < y2) {
        bool m0 = y >= y0;
        x0 += (t)? ((m0)? t : (cnt0)? 0 : -1) : 0;
        cnt0 = (m0)? t : (cnt0)? cnt0 - 1 : cnt0;
        bool m1 = y < y1;
        x1 += (t)? ((m1)? t : (cnt1)? 0 : -1) : 0;
        cnt1 = (m1)? t : (cnt1)? cnt1 - 1 : cnt1;
        }
      }

The pair of the coordinates x₀, x₁ is then sorted among the edge coordinates by Edge Generator 103.

Sorter

As illustrated in FIG. 7, a Sorter 104 is a four-input tree compare/multiplex hardware device, three inputs of which are coupled to outputs of three Edge Generators 103 operating within the same clock cycle, and one input of which is coupled to an output of the Sorter 104 operating in the previous clock cycle. In an embodiment comprising four groups of Edge Generators 103 there are four Sorters 104. Each Edge Generator 103 delivers the direction of a half-plane (left or right) as a tag for the x coordinate value. A Sorter 104 compares x values for edges of different types separately.

    typedef struct {    // the output of an EG
        int  x;         // the position
        bool uf, ov;    // beyond the bounding box
        bool dir;       // left (0) or right (1)
        } edge_out;

    class temp_span {
    public:
        int  x0, x1;    // left and right
        bool uf0, ov0;  // left beyond the bounding box
        bool uf1, ov1;  // right beyond the bounding box
        temp_span ( ) :
            x0 (0), x1 (0),
            uf0 (false), ov0 (false),
            uf1 (false), ov1 (false) { }
        temp_span (edge_out ed) : temp_span ( ) {
            if (ed.dir) {   // if the edge is right, then it is the maximal x
                x1 = ed.x;
                uf1 = ed.uf;
                ov1 = ed.ov;
                }
            else {          // if the edge is left, then it is the minimal x
                x0 = ed.x;
                uf0 = ed.uf;
                ov0 = ed.ov;
                }
            }
        };

    temp_span sort_two (
        temp_span s0,
        temp_span s1
        ) {
        temp_span result;
        bool x0m = s0.uf0 || s1.ov0 ||                // compare flags
            (!s1.uf0 && !s0.ov0 && s0.x0 < s1.x0);    // and values
        bool x1m = s0.ov1 || s1.uf1 ||
            (!s1.ov1 && !s0.uf1 && s0.x1 >= s1.x1);
        result.x0 = (x0m)? s1.x0 : s0.x0;     // max of left
        result.uf0 = (x0m)? s1.uf0 : s0.uf0;
        result.ov0 = (x0m)? s1.ov0 : s0.ov0;
        result.x1 = (x1m)? s1.x1 : s0.x1;     // min of right
        result.uf1 = (x1m)? s1.uf1 : s0.uf1;
        result.ov1 = (x1m)? s1.ov1 : s0.ov1;
        return result;
        }

    temp_span sorter (
        temp_span s0,   // The output of the previous Sorter
        edge_out  x1,   // The first EG output
        edge_out  x2,   // The second EG output
        edge_out  x3    // The third EG output
        ) {
        temp_span s1 (x1), s2 (x2), s3 (x3);
        return sort_two (
            sort_two (s0, s1),
            sort_two (s2, s3)
            );
        }

Span Buffer Interface

The Span Buffer interface (also known as the Output Interface 106 shown in FIG. 7) converts the last Sorter output to absolute coordinates (note that the values are bounding box relative from the Loaders 102 through the Sorters 104) and packs them into the Span Buffer.
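By way of illustration only, the conversion of a single bounding-box-relative coordinate to an absolute coordinate may be sketched in C++ as follows. The function name toAbsolute and the constants kOverflow and kUnderflow are hypothetical. The sentinel values (INT_MAX for an end lying to the right of the bounding box, -1 for an end lying to the left) follow the sb_interface pseudo-code of this section, and the check against the bounding box width is likewise taken from that pseudo-code.

```cpp
#include <cassert>
#include <climits>

// Hypothetical sentinels for ends that fall outside the bounding box.
const int kOverflow  = INT_MAX;  // end lies right of the bounding box
const int kUnderflow = -1;       // end lies left of the bounding box

// Converts one bounding-box-relative x coordinate to an absolute one.
//   x      : coordinate relative to xMin
//   uf, ov : underflow / overflow flags delivered with the coordinate
//   xMin, xMax : absolute limits of the bounding box
inline int toAbsolute(int x, bool uf, bool ov, int xMin, int xMax) {
    assert(xMin <= xMax);
    assert(!(uf && ov));
    if (ov) return kOverflow;
    if (uf) return kUnderflow;
    if (x > xMax - xMin) return kOverflow;  // beyond the real box width
    return x + xMin;
}
```

For a bounding box from 100 to 120, a relative coordinate of 5 becomes 105, whereas a relative coordinate of 25 exceeds the box width of 20 and is reported as overflow.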

At this point of the span generation process, the computed values comprise the output of the last Sorter s3 and the bypassed outputs of the three other Sorters s0, s1 and s2. Also available are the current y coordinate, the x_min and x_max parameters of the bounding box, and k=1, 2, 4 representing the number of Edge Generators 103 computing spans for the same functional. Also note that the Sorters 104 are doubled, since at the lowest rate there are two spans generated per clock cycle, and therefore two spans are processed per clock cycle in parallel.

    bool  wf;               // wire-frame mode wf == true
    bool  update = true;    // when a new triangle starts SB should get new y
    SHORT xMax, xMin;
    SHORT y;                // from the current y counter
    SHORT w = xMax - xMin;  // The real bounding box size

    typedef struct {
        int x0, x1;   // the values -1 and -2 are reserved for
                      // uf and ov accordingly.
        } span;

    SHORT sb_cnt = 0;       // the counter of position in SB row
    span  spare_buffer [16];

    void WriteNextToSB (
        span  sp,     // span to write
        SHORT pos,    // position in the SB row
        bool  next    // next row
        );

    void move_sb_cnt ( ) {
        sb_cnt ++;
        if (sb_cnt >= 8) {
            Span_Buffer.Write (     // see TG doc for description
                spare_buffer,
                y & (0xfffffff0 | wf << 3),   // y is aligned
                update,             // update y if necessary
                wf,                 // wire-frame
                update);
            update = false;
            sb_cnt = 0;
            }
        }

    void sb_interface (
        temp_span s [8],  // two first - from the first level Sorter,
                          // two next - from second level, etc.
        SHORT y,      // from the y counter
        SHORT xmin,   // from the input
        SHORT xmax,   // from the input
        SHORT k       // from the input
        ) {
        int j;
        span sp [8];  // temporary spans
        // actually the following is performed at Sorters outputs
        for (j = 0; j < 8; j++) {
            sp [j].x0 =
                (s [j].ov0)? MAX_INT :
                (s [j].uf0)? -1 :
                (s [j].x0 > w)? MAX_INT : s [j].x0 + xMin;
            sp [j].x1 =
                (s [j].ov1)? MAX_INT :
                (s [j].uf1)? -1 :
                (s [j].x1 > w)? MAX_INT : s [j].x1 + xMin;
            }
        if (update) {
            sb_cnt = y & ((wf)? 0x7 : 0xf);
            for (j = 0; j < 16; j++)
                spare_buffer [j].x0 = spare_buffer [j].x1 = 0;
            }
        if (wf) {
            bool empty [4];   // empty flags for all 4 spans of the
                              // internal triangle
            for (j = 0; j < 4; j ++) {
                empty [j] =
                    sp [j + 4].x0 == MAX_INT ||
                    sp [j + 4].x1 == -1 ||
                    sp [j + 4].x0 > sp [j + 4].x1;
                spare_buffer [sb_cnt + j].x0 = sp [j].x0;
                if (empty [j]) {
                    spare_buffer [sb_cnt + j].x1 = sp [j].x1;
                    spare_buffer [sb_cnt + j + 8].x0 = MAX_INT;
                    spare_buffer [sb_cnt + j + 8].x1 = -1;
                    }
                else {
                    spare_buffer [sb_cnt + j].x1 = sp [j + 4].x0;
                    spare_buffer [sb_cnt + j + 8].x0 = sp [j + 4].x1;
                    spare_buffer [sb_cnt + j + 8].x1 = sp [j].x1;
                    }
                move_sb_cnt ( );
                }
            sb_cnt = (sb_cnt + 4);
            return;
            }
        for (j = (k - 1) * 2; j < k * 2; j ++) {
            if (k == 2 && j == 4)   // when k == 2 we use first and third
                j = 6;              // sorters outputs
            spare_buffer [sb_cnt] = sp [j];
            move_sb_cnt ( );
            }
        }
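By way of illustration only, the wire-frame handling of one scan line may be sketched in C++ as follows. As in the pseudo-code above, the span of the outer triangle is split around the span of the internal triangle into a left piece and a right piece; when the internal span is empty, the outer span is kept whole and the second piece is marked empty. The names Span, WirePair, isEmpty and splitWireFrame are hypothetical, and the sketch assumes that a non-empty internal span lies within the outer span.

```cpp
#include <cassert>
#include <climits>

// Absolute left and right pixel positions of one span.
struct Span {
    int x0;
    int x1;
};

// Sentinels as used in the pseudo-code: INT_MAX for overflow, -1 for underflow.
const int kOverflow  = INT_MAX;
const int kUnderflow = -1;

// A span is empty when its left end overflowed, its right end
// underflowed, or its ends are crossed.
inline bool isEmpty(const Span& s) {
    return s.x0 == kOverflow || s.x1 == kUnderflow || s.x0 > s.x1;
}

// The two pieces written for one wire-frame scan line.
struct WirePair {
    Span first;
    Span second;
};

inline WirePair splitWireFrame(const Span& outer, const Span& inner) {
    const Span none = {kOverflow, kUnderflow};  // explicit empty span
    if (isEmpty(inner)) {
        return WirePair{outer, none};
    }
    // ASSUMPTION of this sketch: the internal span is nested in the outer one.
    assert(outer.x0 <= inner.x0 && inner.x1 <= outer.x1);
    return WirePair{Span{outer.x0, inner.x0}, Span{inner.x1, outer.x1}};
}
```

For an outer span from 10 to 30 and an internal span from 14 to 26, the two pieces run from 10 to 14 and from 26 to 30. Near a vertex, where the internal span is empty, the outer span is emitted unchanged.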

Foregoing described embodiments of the invention are provided as illustrations and descriptions. They are not intended to limit the invention to precise form described. In particular, it is contemplated that functional implementation of invention described herein may be implemented equivalently in hardware, software, firmware, and/or other available functional components or building blocks, and that networks may be wired, wireless, or a combination of wired and wireless. The pseudo-code fragments represent high-level implementation examples and are intended to illustrate one way of implementing functionalities described herein. Other variations and embodiments are possible in light of above teachings, and it is thus intended that the scope of invention not be limited by this Detailed Description, but rather by Claims following.

1. A method for rasterization, comprising the steps of: determining a crossing point of a line and a bounding box, the line represented by a functional ƒ(x, y)=a·x+b·y+c; and performing a Bresenham walk along a portion of the line, the portion falling within the bounding box, the portion having (a) an initial x-coordinate x₀ and (b) one or more additional x-coordinates with increasing x-values; wherein the finding step and the performing step comprise using an adder, and wherein the finding step and the performing step do not comprise using a multiplier or a divider.
2. The method of claim 1, wherein the adder comprises a short-adder for iteratively subtracting a from b and from c.
3. The method of claim 1, wherein the initial x-coordinate x₀=floor(−c/a) and the one or more additional x-coordinates increase by at least Δx=floor(−b/a), and wherein the adder comprises a short-adder for iteratively subtracting a from c for computing x₀ and for iteratively subtracting a from b for computing Δx.
4. A method for rasterization, comprising the steps of: receiving (a) a first set of coefficients representing a first functional, and (b) a bounding box offset; and computing a second set of coefficients a, b and c representing a second functional ƒ(x, y)=a·x+b·y+c, the second functional falling within a quadrant indicated by a<0 and b≧0; wherein the computing step comprises scaling an intermediate value, the scaling comprising a cyclic bit-rotation of the intermediate value, whereby the intermediate value is represented within a first bit-length, a scaled version of the intermediate value is represented within a second bit-length, and the second bit-length does not exceed the first bit-length.
5. The method of claim 4, further comprising the step of passing from a first coordinate system to a second coordinate system, the first functional according to the first coordinate system and the second functional according to the second coordinate system.
6. The method of claim 5, wherein the first coordinate system corresponds to a main grid and the second coordinate system corresponds to an over-sampling grid.
7. A method for providing scissoring by a view-port, comprising the steps of: receiving (a) a first pixel span expressed relative to an x axis and a y axis, (b) one or more coordinates specifying a scissoring box relative to the x and y axes, and (c) a scissoring box rotation angle tangent expressed relative to the x and y axes; and computing a second pixel span, the second pixel span expressed relative to the scissoring box; wherein the scissoring box indicates a rotated clipped rectangle.
8. The method of claim 7, wherein the rotation angle tangent is 0, 1, ½ or ⅓.
9. The method of claim 8, wherein the computing step comprises a hardware element for implementing a divide-by-3 algorithm, the hardware element comprising a short-adder.